A Characterization of t/s-Diagnosability and Sequential
t-Diagnosability in Designs

JOO-KANG  LEE  and  JON  T.  BUTLER 

Abstract: A multiprocessing system is t/s-diagnosable if all faulty
processors can be identified to within s processors provided there
are no more than t faulty processors. A characterization theorem of
Karunanithi and Friedman [4] for t/s-diagnosability in certain special
cases of systems called designs is extended to the entire class of
$D_{\delta,t'}(n)$ designs. We show that for large t, s is approximately
$t^2/4t'$. Furthermore, the minimum number of processors needed to attain
a given diagnosability is derived.

A multiprocessor system is sequentially t-diagnosable if at least one
faulty processor can be identified provided there are no more than t
faulty processors. A theorem by Preparata, Metze, and Chien [7] giving
a sufficient condition for sequential t-diagnosability in the single loop
system, a special case of designs, is extended to the entire class of
$D_{\delta,t'}(n)$ designs. We show that, for large t, approximately
$t^2/4t'$ nodes are needed for a $D_{\delta,t'}(n)$ design to be
sequentially t-diagnosable.

Index Terms: Multiprocessing systems, reliable computing, systems
diagnosis, t-diagnosable, t/s-diagnosable, testing.

I.  Introduction 

In the systems diagnosis approach to reliable computing, fault location
is achieved by tests among processors. We assume that fault-free
processors produce test results that are a true representation of the
tested processor: fail if it is faulty and pass if it is fault-free. In the
case of faulty processors, however, the test results by such processors
may not be correct. The goal is to determine exactly which processors
are faulty. However, if there are too many faulty processors,
incorrect test information can cause ambiguity.

Our model is that of Preparata, Metze, and Chien [7]. A system
S is a directed graph where nodes represent processors and arcs
represent tests among processors. Node $u_i$ tests node $u_j$ iff there is
a directed arc from $u_i$ to $u_j$. Each node has one of two states, faulty
or fault-free, and each arc has one of two weights, pass or fail. For
example, Fig. 1 shows a system of 12 nodes and two arrangements of
three faulty nodes, which are indicated by X's. Fail test outcomes are
indicated by 1's, while unmarked arcs correspond to pass outcomes.

A system is (one-step) t-diagnosable if all faulty nodes can be
uniquely identified provided there are no more than t of them. For
example, the system in Fig. 1 is not 3-diagnosable because the set
of test outcomes shown in Fig. 1(b), which is produced with $u_3$, $u_4$,
and $u_5$ faulty, can also be produced with just $u_3$ and $u_4$ faulty. Thus,
if we assume there are three or fewer faulty nodes in the system,
$u_5$ cannot be uniquely identified as faulty. t-diagnosability represents
worst case conditions. For example, the three faulty nodes in Fig.
1(a) are uniquely identified as faulty.

S is a $D_{\delta,t'}(n)$ design iff an arc exists from node $u_i$ to $u_j$ for
$j - i = \delta p \bmod n$, where p assumes the values $1, 2, \ldots, t'$ and n
is the number of nodes [7]. For example, the system in Fig. 1 is a
$D_{1,2}(12)$ design. From [7], $D_{\delta,t'}(n)$ is t'-diagnosable iff $n \ge 2t' + 1$.
Thus, the system of Fig. 1 is 2-diagnosable. Preparata, Metze, and
Chien [7] observe that, when $\delta$ and n are relatively prime, a $D_{\delta,t'}(n)$
design is isomorphic to a $D_{1,t'}(n)$ design. Thus, the diagnosability
of the former is identical to that of the latter. In the following, we
restrict our attention to $D_{1,t'}(n)$ designs, with the recognition that a
larger class of systems is characterized.
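As a concrete illustration of this definition, the following minimal Python sketch builds the test arcs of a $D_{\delta,t'}(n)$ design and checks the isomorphism observation for one relatively prime pair. The function names (`design_arcs`, `is_t_prime_diagnosable`, `relabel_to_delta_one`) are illustrative and do not come from [7]; the relabeling $i \mapsto i\,\delta^{-1} \bmod n$ is one possible isomorphism, assumed here for the sake of the example.

```python
from math import gcd


def design_arcs(n, t_prime, delta=1):
    """Arcs (tester, tested) of a D_{delta,t'}(n) design.

    Node u_i tests u_j iff j - i = delta * p (mod n) for p = 1, ..., t'.
    """
    return {(i, (i + delta * p) % n)
            for i in range(n)
            for p in range(1, t_prime + 1)}


def is_t_prime_diagnosable(n, t_prime):
    """Condition quoted from [7]: t'-diagnosable iff n >= 2t' + 1."""
    return n >= 2 * t_prime + 1


def relabel_to_delta_one(n, t_prime, delta):
    """Relabel the nodes of D_{delta,t'}(n) by i -> i * delta^{-1} (mod n).

    Hypothetical helper: it requires gcd(delta, n) == 1 and Python 3.8+
    (modular inverse via the three-argument pow).
    """
    if gcd(delta, n) != 1:
        raise ValueError("delta and n must be relatively prime")
    inv = pow(delta, -1, n)
    return {((i * inv) % n, (j * inv) % n)
            for i, j in design_arcs(n, t_prime, delta)}


fig1 = design_arcs(12, 2)                     # the D_{1,2}(12) design of Fig. 1
print(sorted(j for i, j in fig1 if i == 11))  # nodes tested by u_11
print(is_t_prime_diagnosable(12, 2))
print(relabel_to_delta_one(12, 2, 5) == fig1)
```

The last line compares the relabeled $D_{5,2}(12)$ arc set with the $D_{1,2}(12)$ arc set, which is the sense in which the two designs are treated as the same system.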

When the number of faulty nodes exceeds t in a t-diagnosable
system, it may be necessary to replace fault-free nodes in order to
replace all faulty nodes. A system is t/s-diagnosable iff all faulty
nodes can be identified to within a set of s nodes, provided there
are no more than t faulty nodes. s depends on t. For example, from
previously published results [4] and from results in this paper, we
can conclude that $D_{1,2}(12)$ is t/s-diagnosable for t/s = 1/1, 2/2,
3/4, 4/6, 5/8, and t/12, where $6 \le t \le 12$. The last result, t/12, also
follows from an observation in [7], that when the number of faulty
nodes equals or exceeds the number of fault-free nodes, the ambiguity
of the faulty/fault-free status of nodes can extend to the entire set of
nodes.

Building on results of Friedman [2], Karunanithi and Friedman
[4] characterize t/s-diagnosability in two special cases of $D_{1,t'}(n)$
designs,

1) $t' = 1$, and    (1)

2) $t' \ge \lfloor t/2 \rfloor$.    (2)

That is, for these two cases, an expression is derived for s as a
function of t when a minimal number $n_{\min}$ of nodes exists. Furthermore,
an expression for $n_{\min}$ is derived. We extend this result
to $2 \le t' < \lfloor t/2 \rfloor$, covering all other cases of $D_{1,t'}(n)$ designs. We
show that, for such designs, both s and $n_{\min}$ are approximately $t^2/4t'$
when t is large. Thus, the status of almost all nodes in designs with
a near minimal number of nodes can be uncertain in the worst case.

A system is sequentially t-diagnosable iff at least one faulty
node can be identified provided there are no more than t of them.
Preparata, Metze, and Chien [7] show a lower bound $v$ on the number
of nodes n in a special case of designs, $D_{1,1}(n)$, called single
loop systems, such that these systems are sequentially t-diagnosable.
That is, it is shown that if $n \ge v$, then $D_{1,1}(n)$ is sequentially
t-diagnosable, where $v$ depends on t. We extend this result to all
$D_{1,t'}(n)$ designs. For example, $D_{1,2}(12)$ in Fig. 1 is sequentially
5-diagnosable. Specifically, we show a lower bound $n_{\min}$ on the
number of nodes n such that, if $n \ge n_{\min}$, a $D_{1,t'}(n)$ design is
sequentially t-diagnosable. For example, $D_{1,2}(n)$ is sequentially 5- and
6-diagnosable for $n \ge 11$ and $n \ge 13$, respectively.

Neither t/s-diagnosability nor sequential t-diagnosability has
been characterized in general systems. Chwa and Hakimi [1] characterize
t/t-diagnosability, a topic originally studied by Kavianpour
and Friedman [5]. Yang, Masson, and Leonetti [10] give a polynomial
time algorithm, in which all faulty nodes in a t/t-diagnosable
system can be identified except perhaps at most one node, whose
status is in doubt. Manber [6] extends the class of known sequentially
t-diagnosable systems to certain strongly connected systems.
Somani, Agrawal, and Davis [8] characterize the diagnosability of
fault sets in systems. Sullivan [9] was the first to give necessary and
sufficient conditions for t-diagnosability in general systems which
can be checked in polynomial time, unlike previous exponential time
conditions [3].


II.  Background 

We  can  divide  nodes  into  three  categories. 

Definition: Given a system, a set of test outcomes $\sigma$, and an
integer t, a node u is definitely good (definitely bad) with respect
to $\sigma$ if the assumption that u is faulty (fault-free) implies there are
more than t faulty nodes. A node which is neither definitely good
nor definitely bad is suspect.

For example, for t = 3 and for the set of test outcomes shown
in Fig. 1(b), $u_2$, $u_3$, and $u_5$ are a definitely good, definitely bad,
and suspect node, respectively. Note that for any set of test outcomes
produced by any arrangement of t or fewer faulty nodes in a
t'-diagnosable system, the definitely bad nodes correspond exactly to
the faulty nodes, when t = t'. Furthermore, there are no suspects.
The t'-diagnosability of the system precludes such ambiguity. There
can be as many as s suspects in a t/s-diagnosable system. For example,
in the $D_{1,2}(12)$ system of Fig. 1, which is t/12-diagnosable for
$6 \le t \le 12$, when there are six or more faulty nodes, each of the 12
nodes in the system is suspect if all faulty nodes fail fault-free nodes
they test and pass faulty nodes they test. From the results shown below,
if at least one faulty node can be identified in a $D_{1,t'}(n)$ design,
there can be as many as $s - t'$ suspects.

Let F denote a set of faulty nodes in a system S, where $|F| \le t$.
Let $\sigma$ be a syndrome or set of test outcomes produced by F. FR is
a replacement set generated by F through $\sigma$ if

$$FR = \bigcup_i F_i \qquad (3)$$

where $F_i$ produces $\sigma$ and $|F_i| \le t$. It follows that $u \in FR$ iff u is
definitely bad or suspect with respect to $\sigma$. The term replacement
set is used to indicate that, in order to replace all faulty nodes, all
nodes in the replacement set must be replaced by fault-free nodes.
Each definitely bad node is common to all $F_i$, while each suspect
is missing from at least one $F_i$. FR is a maximal replacement set
if there is no larger replacement set with respect to any set of test
outcomes $\sigma$ produced by any fault set of t or fewer nodes. In a
t/s-diagnosable system, $s = |FR|$, where FR is a maximal replacement
set.
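For small systems, the definitions above can be checked by exhaustive search. The sketch below is a brute-force illustration, not a method from this paper: it enumerates every candidate fault set of size at most t, keeps those consistent with a syndrome, and forms the replacement set as their union. The function names and the `faulty_report` parameter are assumptions of the sketch, and the first syndrome is one in the spirit of Fig. 1(b), not necessarily the exact one drawn there.

```python
from itertools import combinations


def design_arcs(n, t_prime):
    """Arcs (tester, tested) of D_{1,t'}(n): u_i tests u_{(i+p) mod n}, p = 1..t'."""
    return [(i, (i + p) % n) for i in range(n) for p in range(1, t_prime + 1)]


def make_syndrome(arcs, faults, faulty_report=lambda i, j: 0):
    """Fault-free testers report truthfully (1 = fail, 0 = pass).

    Faulty testers may report anything; `faulty_report` (an assumption of
    this sketch) chooses their outcomes and defaults to all pass.
    """
    return {(i, j): faulty_report(i, j) if i in faults else int(j in faults)
            for i, j in arcs}


def consistent(candidate, syndrome):
    """True if `candidate`, taken as the fault set, could produce the syndrome."""
    return all(outcome == int(j in candidate)
               for (i, j), outcome in syndrome.items()
               if i not in candidate)


def classify(n, syndrome, t):
    """Return (definitely good, definitely bad, suspect) node sets."""
    candidates = [set(c)
                  for size in range(t + 1)
                  for c in combinations(range(n), size)
                  if consistent(set(c), syndrome)]
    if not candidates:
        raise ValueError("no fault set of size <= t produces this syndrome")
    replacement = set().union(*candidates)          # FR, as in (3)
    definitely_bad = set.intersection(*candidates)  # common to every F_i
    return (set(range(n)) - replacement, definitely_bad,
            replacement - definitely_bad)


arcs = design_arcs(12, 2)

# Three consecutive faulty nodes that pass everything they test, t = 3.
good, bad, suspect = classify(12, make_syndrome(arcs, {3, 4, 5}), t=3)

# Six faulty nodes that fail fault-free nodes and pass faulty ones, t = 6.
half = {0, 1, 2, 3, 4, 5}
sigma6 = make_syndrome(arcs, half, lambda i, j: int(j not in half))
good6, bad6, suspect6 = classify(12, sigma6, t=6)
```

In the first case the search reports $u_3$ and $u_4$ as definitely bad and $u_5$ as the only suspect; in the second, every one of the 12 nodes is suspect, in line with the t/12 example above. The search is exponential in t and is meant only for designs of this size.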


III.  Results 

The diagnosability of a system reflects worst case conditions. That
is, in a t/s-diagnosable system, all faulty nodes can always be identified
to within a set of size s provided that there are no more than
t faulty nodes. However, for a specific arrangement of faulty nodes,
it may be possible to identify the faulty nodes to within a set of size
smaller than s.

Our main result, Theorem 1, gives necessary and sufficient conditions
for a $D_{1,t'}(n)$ design to be t/s-diagnosable. We proceed by
showing worst case conditions, the largest replacement set among
all replacement sets associated with fault sets of t or fewer nodes.
Lemma 1 shows that such a set consists of consecutive nodes. Lemmas
2 and 3 give characteristics of a certain fault set which produces
the largest replacement set. Theorem 1 establishes the size of the
largest replacement set.

Lemma 1: Let FR be a maximal replacement set in a $D_{1,t'}(n)$
design corresponding to a set of test outcomes produced by a fault set
of size t or smaller, where $t > t'$. Then, FR is a set of consecutive
nodes.

Proof. On the contrary, assume there exists a maximal replacement
set FR consisting of $p \ge 2$ segments, $B_0, B_1, \ldots, B_{p-1}$, of
definitely bad and suspect nodes separated by definitely good nodes,
where the direction of tests is toward increasing index. Because of the
intervening definitely good nodes, if $|B_j| \le t'$, all nodes in $B_j$ are
definitely bad, and if $|B_j| > t'$, the first t' nodes are definitely bad,
while the remaining are suspect. Furthermore, since $t > t'$, there is at
least one suspect in a maximal replacement set, FR. On the contrary,
if all nodes in FR are definitely bad, $|FR| \le t$, since there can be
no more definitely bad nodes than faulty nodes. However, $|FR| > t$,
as illustrated by the following: consider syndrome $\sigma$ produced by a
sequence of t consecutive faulty nodes $F = \{u_k, u_{k+1}, \ldots, u_{k+t-1}\}$,
where all test results by faulty nodes are pass except for the test of
fault-free node $u_{k+t}$ by faulty node $u_{k+t-1}$, which is fail. Node $u_{k+t}$
is tested by faulty nodes exclusively, and, since $t > t'$, its faulty/fault-free
status cannot be uniquely determined. The replacement set in
this case contains at least $t + 1$ elements. So also does a maximal
replacement set, FR. Since there is at least one suspect, there is at
least one segment $B_i$ containing a suspect, and $|B_i| > t'$.

Let $G_i$ be the definitely good nodes between $B_i$ and $B_{i+1}$, where
index addition is mod p. Since the direction of tests is toward increasing
index, nodes in $B_i$ test nodes in $G_i$, and nodes in $G_i$ test
nodes in $B_{i+1}$. Given a sequence of nodes B, ${}^*B$ is B if $|B| \le t'$
and is the first t' nodes of B beginning with the (unique) node in
B not tested by another node in B, if $|B| > t'$. We now show that
a sequence of $|{}^*B_{i+1}|$ nodes in $G_i$ nearest $B_i$ can be converted to
suspect nodes.

Indeed $|G_i| \ge |{}^*B_{i+1}|$, as follows. Nodes that are definitely bad in
FR correspond to nodes that are faulty in all fault sets which generate
FR. Thus, if F is a fault set which generates FR through $\sigma$, then
tests by such nodes are arbitrary. Consider the case where tests by
definitely bad nodes are all pass. All nodes in $G_i$ are tested by definitely
bad and suspect nodes in $B_i$. Furthermore, all suspects in $B_i$
which test nodes in $G_i$ must produce pass test outcomes; no suspect
fails a definitely good node. Then, regardless of other test outcomes,
$F' = F \cup G_i - {}^*B_{i+1}$ is a fault set which generates FR through $\sigma$.
If $|G_i| < |{}^*B_{i+1}|$, then $|F'| < |F| \le t$, and nodes in $G_i$ are suspect, not
definitely good, as assumed. Then, it must be that $|G_i| \ge |{}^*B_{i+1}|$.

Let $G_i^{**}$ be the $|{}^*B_{i+1}|$ nodes in $G_i$ just preceding $B_{i+1}$. Form
a new system by removing the sequence of nodes $G_i - G_i^{**}$ along
with tests by these nodes and inserting them immediately before the
definitely bad nodes in $B_i$. The severed tests are applied in their
original order, so that the resulting system is a $D_{1,t'}(n)$ design.
Retain the outcomes of all tests, except

1) Test outcomes of tests applied by nodes in $G_i - G_i^{**}$ after
rearrangement agree with the definitely good node immediately preceding
$B_i$ before rearrangement,

2) Test outcomes of tests applied by nodes in $G_{i-1}$ to nodes in $B_i$
after rearrangement agree with the test outcomes before rearrangement
of the definitely good node immediately preceding $B_i$. Test
outcomes of the tests applied by nodes in $G_{i-1}$ to nodes $G_i - G_i^{**}$
after rearrangement are all pass, and

3) Test outcomes of tests applied by nodes in $B_i$ to nodes in $G_i^{**}$
and follow on nodes agree with the test results of the node in
$G_i - G_i^{**}$ immediately preceding $G_i^{**}$ before rearrangement.

By virtue of the choice of test outcomes, a fault set F consistent
with the syndrome before rearrangement is consistent after.
However, $F' = F \cup G_i^{**} - {}^*B_{i+1}$ is now also consistent. Since
$|F'| = |F| \le t$, nodes in $G_i^{**}$ are now suspect, as are nodes in
${}^*B_{i+1}$. Thus, the total number of definitely bad nodes and suspects
is larger. It follows that FR is not a maximal replacement set as
assumed. Q.E.D.
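The syndrome used in the first part of this proof (t consecutive faulty nodes that pass everything except for one fail on the next fault-free node) can be checked by brute force on the $D_{1,2}(12)$ design of Fig. 1. The sketch below is only an illustration under that specific choice of design, with t = 3 and k = 3; the function names are hypothetical and the exhaustive search is not part of the proof.

```python
from itertools import combinations


def lemma1_syndrome(n, t_prime, t, k):
    """Syndrome from faults u_k..u_{k+t-1} in D_{1,t'}(n) (1 = fail, 0 = pass).

    Faulty testers pass everything, except that the last faulty node
    u_{k+t-1} fails the fault-free node u_{k+t}. Fault-free testers are truthful.
    """
    faults = {(k + m) % n for m in range(t)}
    last, victim = (k + t - 1) % n, (k + t) % n
    syndrome = {}
    for i in range(n):
        for p in range(1, t_prime + 1):
            j = (i + p) % n
            if i in faults:
                syndrome[(i, j)] = int(i == last and j == victim)
            else:
                syndrome[(i, j)] = int(j in faults)
    return faults, syndrome


def replacement_set(n, t, syndrome):
    """Union of all fault sets of size <= t consistent with the syndrome."""
    union = set()
    for size in range(t + 1):
        for cand in map(set, combinations(range(n), size)):
            if all(out == int(j in cand)
                   for (i, j), out in syndrome.items() if i not in cand):
                union |= cand
    return union


faults, sigma = lemma1_syndrome(n=12, t_prime=2, t=3, k=3)
fr = replacement_set(12, 3, sigma)
print(sorted(fr), len(fr) >= len(faults) + 1)
```

For this instance the replacement set is the four consecutive nodes $u_3, \ldots, u_6$, so it has $t + 1$ elements and is made up of consecutive nodes, as the proof argues.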

For the interested reader, the Appendix illustrates the proof of
Lemma 1 using a specific design. In a t/s-diagnosable system, if
there are t or fewer faulty nodes, their location extends, in the worst
case, to a set of s nodes. From Lemma 1, this worst case corresponds
to consecutive nodes. The next two lemmas concern the characteristics
of fault sets for this worst case situation.

Lemma 2: Let FR be a maximal replacement set in a $D_{1,t'}(n)$
design with $t > t'$ faulty nodes. If there is at least one definitely
good node, then there exists a fault set F which generates FR such
that each faulty node in F belongs to a sequence of consecutive faulty
nodes of length t' or more.

Proof. Assume there is at least one definitely good node, and
let FR be a maximal replacement set in $D_{1,t'}(n)$. Let $\sigma$ be a set
of test outcomes and F' be a set of t or fewer nodes such that F'
generates FR through $\sigma$. F' consists of segments $F_0, F_1, \ldots, F_{q-1}$
of consecutive faulty nodes separated by fault-free nodes, where the
direction of tests is toward increasing index. We proceed by showing
that, if there is at least one segment $F_i$ such that $|F_i| < t'$, then
there is another fault set F which generates FR, where all nodes in
F belong to a sequence of consecutive faulty nodes each of length t'
or more.

It is sufficient to show that under the above conditions, we can
join $F_i$ with $F_{i-1}$ without changing FR, where index subtraction is
mod q. Indeed, there is at least one other segment, since $|F_i| < t'$
and $t > t'$. Let $FF_i$ be the segment of fault-free nodes immediately
following $F_i$, where the direction of tests is from $F_i$ towards $FF_i$.

Since there is at least one definitely good node, there is a definitely
good node u immediately preceding FR. From Lemma 1,
FR is a set of consecutive nodes, and it has length greater than
t'. Thus, all outcomes of tests by u are fail. On the contrary, if
any test outcome is pass, the tested node must be definitely good.
Thus, the first t' nodes of FR are definitely bad, while all subsequent
nodes in FR are suspect. Since $|F_i| < t'$, $F_i$ does not contain the
first t' definitely bad nodes, and so $F_i$ contains only suspect nodes,
while $F_{i-1}$ consists of suspects and/or definitely bad nodes. Similarly,
$FF_i$ consists of suspects and/or definitely good nodes. Let
$FF_{i-1/i} = FF_{i-1} \cup \{\text{suspects in } FF_i\} = \{u_0, u_1, \ldots, u_g\}$.
Furthermore, assume that the indexes correspond to the natural order
of nodes as determined by tests. That is, $u_0, u_1, \ldots$, and $u_{|FF_{i-1}|-1}$
correspond to the nodes of $FF_{i-1}$, such that $u_0$ tests $u_1$, $u_1$ tests
$u_2$, etc. Similarly, $u_{|FF_{i-1}|}$ is the first suspect in $FF_i$, $u_{|FF_{i-1}|+1}$
is the second, etc. Because nodes in $FF_{i-1}$ and $FF_i$ are fault-free,
all tests among nodes in $FF_{i-1/i}$ are pass. Since $u_g$ is suspect,
there is a fault set F'', where $|F''| \le t$, which generates FR
through $\sigma$, such that $u_g \in F''$. Since there is a path of pass test
outcomes from any $u_j \in FF_{i-1/i}$ to $u_g$, $u_j \in F''$. Thus, $u_g \in F''$
implies $FF_{i-1/i} \subseteq F''$.

Besides F' and F'', there are other sets which generate FR
through $\sigma$. F''', which contains the first k nodes of $FF_{i-1/i}$, where
$1 \le k \le |FF_{i-1/i}|$, is consistent with $\sigma$. However, $F'' \cup F''' = F''$,
and since $|F''| \le t$, then $|F'''| \le t$. Therefore, it is sufficient to consider
fault sets which contain all members of $FF_{i-1/i}$ or no members
of $FF_{i-1/i}$.

Note that this observation is independent of the position of $F_i$
within $FF_{i-1/i}$. Specifically, the following rearrangement of nodes
and tests leaves FR unchanged, but produces a fault set generating
FR which is the same as F' except $F_{i-1}$ and $F_i$ are combined as
a single sequence of faulty nodes. That is, all nodes in $F_i$ plus all
tests by $F_i$ are inserted between $F_{i-1}$ and $FF_{i-1}$. All tests are
reconnected in their natural order, and all test results among faulty
nodes after rearrangement are pass. Q.E.D.

From Lemma 1, a maximal replacement set FR in a $D_{1,t'}(n)$
design corresponding to a set of $t > t'$ faulty nodes consists of consecutive
nodes. From Lemma 2, there is a fault set F which generates
FR where all nodes in F belong to segments of consecutive nodes of
length at least t'. Also, from the proof, it is clear that F generates FR
through a set of test outcomes $\sigma$ where no faulty node fails another
faulty node. Thus, fail test outcomes in $\sigma$ occur only between nodes
of different status, and $\sigma$ consists of c groups of fail test outcomes
separated by pass test outcomes. Let C be a canonic fault set which
generates a maximal replacement set FR iff all nodes in C belong to
segments of consecutive nodes of length t' or greater with at most
one segment having length strictly greater than t'. The existence of
C is assured because it is consistent with $\sigma$. That is, each of the
c groups of fail test outcomes in $\sigma$ corresponds to t' faulty nodes
with the last group (farthest from the definitely bad nodes in FR in
the direction of tests) having an additional $t - t'c$ consecutive faulty
nodes.

Lemma 3: Let FR be a maximal replacement set in a $D_{1,t'}(n)$
design with $t > t'$ faulty nodes. If there exists at least one definitely
good node, a segment of suspect fault-free nodes $FF_i$ that follows
a segment of faulty nodes in a canonic fault set which generates FR
has the property

$$|FF_i| = t - t'c \qquad (4)$$

where $c \in \{\lfloor t/2t' \rfloor, \lceil t/2t' \rceil\}$ such that $c(t - t'c) + t$ has maximum
value.

Proof: Let $F_0, F_1, \ldots$, and $F_k$ be the segments of consecutive
faulty nodes in C, and let $FF_i$ be the fault-free nodes between $F_i$
and $F_{i+1}$. Because $t > t'$, $k \ge 1$. If there is at least one definitely
good node, one segment is definitely bad. Let $F_0$ be the set of
t' definitely bad nodes in FR, and let $\sigma$ be the syndrome which
generates FR in which all faulty nodes produce pass test outcomes.
Since $F_{i+1}$, for $0 \le i \le k - 1$, is a set of suspects, there is another
fault set not containing $F_{i+1}$. But this implies containment of $FF_i$.
Since $|F_i| = t'$ (by Lemma 2 and the definition of C), there are
no tests between adjacent $FF_i$'s. Thus, a smallest fault set F such
that $F_{i+1} \not\subseteq F$ is $F = C - L - F_{i+1} \cup FF_i$, where L is the last
$t - t'c - t'$ faulty nodes in $F_k$ in the direction of tests. The fail
test outcomes at the site of each of the k other $F_i$ imply at least t'
faulty nodes. Since $|C - L| = t'(k + 1)$ and $|F_{i+1}| = t'$, it follows
that $|FF_i| + t'k \le t$. Thus, $|FF_i| \le t - t'k$. But FR is a maximal
replacement set, and so

$$|FF_i|_{\max} = t - t'k \qquad (5)$$

and

$$s = k\,|FF_i|_{\max} + t. \qquad (6)$$

From (5) and (6), we have

$$s = k(t - t'k) + t \qquad (7)$$

where k is chosen so that $ds/dk$ is 0. Thus, $k = \lceil t/2t' \rceil$ or $\lfloor t/2t' \rfloor$.
Q.E.D.

Lemma 2 and the observations that follow it show that the canonic
fault set C generates a maximal replacement set FR. Lemma 3 shows
that the number of fault-free nodes separating segments of faulty
nodes in C has some maximum value, $t - t'c$. Fig. 2 shows the
canonic fault set C and a syndrome $\sigma$ produced by it. Each column
associated with $\sigma$ corresponds to the test results of the node just above
the column. 0 is pass and 1 is fail. That is, in a $D_{1,t'}(n)$ design,
node $u_i$ tests $u_j$ iff $j - i = p \bmod n$, where $p = 1, 2, \ldots, t'$. The
top row of test results corresponds to p = 1, the second corresponds
to p = 2, and so on, and the last corresponds to p = t'. Thus, the leftmost
node in $F_i$ fails all tests applied to it, since these are by fault-free
nodes. The next node fails all but one test, that by the faulty node
just to its left, etc. Fig. 2 also shows other fault sets $C_1, C_2, \ldots$,
and $C_c$ which generate $\sigma$. Note that in $C_i$ the nodes just preceding
$F_i$ in C are faulty. These nodes are fault-free in C but faulty in $C_i$,
and since $|C_i| = t$, they are suspect nodes. Fig. 2 also shows C',
the fault set with fewest nodes, $(c + 1)t'$, which generates FR.

Let R be the set of nodes outside FR. Since nodes in R are definitely
good, the assumption that any one
is faulty leads to the conclusion that there are more than t faulty
nodes in the system. This imposes a lower bound on the size of R.
For example, if the node in R immediately preceding $F_0$ is faulty,
then so also are all nodes in R, as well as all nodes in the segment
of $t - t'c - t'$ nodes labeled L in Fig. 2. The smallest number of
remaining faulty nodes that is consistent with the fail test outcomes
is $ct'$, all nodes in C' less those in $F_0$. Thus, we require
$|R| + t - t'c - t' + t'c = |R| + t - t' > t$ or

$$|R| > t'. \qquad (8)$$

This observation is a part of the proof of Theorem 1.

Theorem 1: Design $D_{1,t'}(n)$ is t/s-diagnosable iff

$$n \ge n_{\min} = s + \min(t, t') + 1 \qquad (9)$$

where

$$s = \max\bigl\{\lfloor t/2t' \rfloor\,(t - t'\lfloor t/2t' \rfloor) + t,\ \lceil t/2t' \rceil\,(t - t'\lceil t/2t' \rceil) + t\bigr\}. \qquad (10)$$

Proof: There are two cases, $t \le t'$ and $t > t'$. For $t \le t'$, s = t
and the inequality becomes $n \ge n_{\min} = 2t + 1$, which is necessary
and sufficient for t/s-diagnosability in $D_{1,t'}(n)$ designs, where
$\lfloor t/2 \rfloor \le t'$, as given in Theorem 3 of [4]. (The expression for s in
Theorem 3 of [4], $s = 2t - t'$, is valid only for $t > t'$. For $t \le t'$,
the correct expression is s = t.)

Now consider the second case.

(if) Assume the condition holds, but S is not t/s-diagnosable.
Then, there exists a fault set F where $|F| \le t$ which generates a
replacement set FR such that $|FR| > s$, where s is given in (10).
However, it follows from Lemma 3 that, if there is at least one
definitely good node, a maximal replacement set FR' consists of c
segments of nodes that are fault-free in the canonic fault set, each
having size $t - t'c$, plus the t faulty nodes in C, for a total of
$c(t - t'c) + t$ nodes, where c is either $\lceil t/2t' \rceil$ or $\lfloor t/2t' \rfloor$ depending
on which produces the maximum value of $c(t - t'c) + t$. Thus, it must
be that there is no definitely good node, and all nodes are suspect.
Specifically, nodes in $R = V - FR'$, where V is the set of all nodes,
are suspect. We now show that this leads to a contradiction, and it
must be that S is indeed t/s-diagnosable.

It follows that $|R| = n - s$. Since $t > t'$, $\min(t, t') = t'$, and from
(9), $n - s \ge t' + 1$. Thus, $|R| \ge t' + 1$. Since nodes in R are suspect,
the set $F' = C \cup R - F_0$ is a set of smallest size where nodes in R
are faulty which is consistent with a set of test outcomes produced by
C and where $F_0$ is the set of nodes that would be definitely bad if at
least one definitely good node exists. Since $C \cap R = \emptyset$ and $F_0 \subseteq C$,
we have

$$|F'| = |R| + t - |F_0| \ge t' + 1 + t - t' = t + 1 \qquad (11)$$

which is a contradiction.

(only if) Suppose that S is t/s-diagnosable, but $n < s + t' + 1$.
Since $n < s + t' + 1$, the set R of definitely good nodes is no larger
than t' in the worst case of a replacement set of s nodes, where s is
given in the hypothesis. However, the set of test outcomes produced
by a canonic fault set C can also be produced by $F' = C \cup R - F_0$.
Since $|F'| = |F| = t$, R consists of suspects, not definitely good
nodes. Q.E.D.
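The bound of Theorem 1 is easy to evaluate numerically. The following sketch assumes only what the proof states: s = t when $t \le t'$, and otherwise s is the larger of $c(t - t'c) + t$ over $c = \lfloor t/2t' \rfloor$ and $c = \lceil t/2t' \rceil$. The function names `replacement_bound` and `n_min` are illustrative.

```python
def replacement_bound(t, t_prime):
    """Largest replacement set size s for a D_{1,t'}(n) design with n >= n_min."""
    if t <= t_prime:
        return t                       # first case of the proof: s = t
    low = t // (2 * t_prime)           # floor(t / 2t')
    high = -(-t // (2 * t_prime))      # ceil(t / 2t'), using integer arithmetic
    return max(c * (t - t_prime * c) + t for c in (low, high))


def n_min(t, t_prime):
    """Minimum number of nodes from (9): s + min(t, t') + 1."""
    return replacement_bound(t, t_prime) + min(t, t_prime) + 1


# t/s pairs and n_min for t' = 2, the order of the design in Fig. 1.
for t in range(1, 7):
    print(t, replacement_bound(t, 2), n_min(t, 2))
```

For t' = 2 this reproduces the pairs 1/1, 2/2, 3/4, 4/6, and 5/8 quoted in the Introduction, together with the thresholds 11 and 13 for t = 5 and t = 6.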

For the special case of t' = 1, Theorem 1 applies to the single loop
system. The statement in this case is identical to that of Theorem 1 of
[4]. It follows from Theorem 1 that in a $D_{1,t'}(n)$ design if $n < n_{\min}$,
then there exists a set of t faulty nodes and a set of test outcomes
such that this system is not sequentially t-diagnosable. Conversely,
from Theorem 1, if $n \ge n_{\min}$, there is no arrangement of t faulty
nodes and no set of test outcomes such that all nodes are suspect.
Since at least one faulty node can be identified, such a system is
sequentially t-diagnosable. Thus, we have the following.

Corollary: Design $D_{1,t'}(n)$ is sequentially t-diagnosable iff

$$n \ge n_{\min} = s + \min(t, t') + 1 \qquad (12)$$

where s is as given in (10).

IV.  Concluding  Remarks 

Table I shows the value of s and $n_{\min}$ such that a $D_{1,t'}(n)$ design is
t/s-diagnosable for $n \ge n_{\min}$. t' varies across the columns and t
varies down the rows. Each entry is $s/n_{\min}$. The column headed by
t' = 1 corresponds to a single loop system and $n_{\min}$ in this column
agrees, as it should, with the values derived by Preparata, Metze, and
Chien [7] for the lower bound on n such that a single loop system
is sequentially t-diagnosable. The nonbold data represent previous
results. For example, the nonbold data associated with $t' \ge \lfloor t/2 \rfloor$ is
that of Karunanithi and Friedman [4], while the nonbold data associated
with t' = 1 is from [4] and [7]. The bold data represent data
from the results of this paper not covered by these previous papers.

Fig. 3 shows a three-dimensional plot of s versus t with t' as a
parameter for $1 \le t' \le 10$. The thin lines represent the data derived
from the results of Karunanithi and Friedman [4], while the heavy
lines represent the data derived by the results of this paper. This
shows that, compared to higher order designs, it is much more difficult
to locate faults in single loop systems. That is, as one progresses
towards designs with more tests, a smaller maximal replacement set
is required for some fixed number of faults in the design. However,
a point of diminishing returns is reached, where added tests produce
only marginally smaller maximal replacement sets.

Fig. 3. s versus t and t' required for t/s-diagnosability in $D_{1,t'}(n)$ designs,
for $1 \le t' \le 10$ and $1 \le t \le 20$.

Fig. 4. $n_{\min}$ versus t and t' required for t/s-diagnosability in $D_{1,t'}(n)$
designs, for $1 \le t' \le 10$ and $1 \le t \le 20$, where $n \ge n_{\min}$.

We can obtain a simple expression for s as a function of t and
t' for large t. Let g(t) ~ h(t) mean lim_{t→∞} g(t)/h(t) = 1. Then,
the expressions within the ceiling and floor brackets of (10) can be
replaced as follows: ⌈t/2t'⌉ ~ t/2t' and ⌊t/2t'⌋ ~ t/2t', in which
case, the arguments of the max operator in (10) have the same form,
and we can write

$$ s \sim \frac{t^{2}}{4t'}. $$

For large t, s is directly proportional to t^2 and inversely proportional
to t'. Thus, the curves for fixed t' in Fig. 3 are approximately half
parabolas. This is most evident in the curve for t' = 1. It is also
worth noting that for small t, specifically, t ≤ t', s is the linear
function s = t, since, for this case, all faulty nodes can be uniquely
identified. This is most evident in the curve for t' = 10.

Fig. 4 shows a three-dimensional plot of n_min versus t' and t. This
resembles the plot of Fig. 3 and shows the large influence of the s
term in the expression for n_min. The thin lines represent data due
to Karunanithi and Friedman [4] and Preparata, Metze, and Chien
[7], while the heavy lines represent data derived from results of this
paper. Similarly, it can be seen that

$$ n_{\min} \sim \frac{t^{2}}{4t'}. $$

Thus, for large t, approximately t^2/4t' nodes are necessary for a
D_{1,t'}(n) design to be sequentially t-diagnosable.
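
As a rough numerical illustration of this estimate, the following Python sketch (not part of the paper) evaluates t^2/4t' and the relative reduction obtained by raising t' by one. The function names are hypothetical, and the values are large-t estimates only, not entries of Table I.

```python
def asymptotic_estimate(t, t_prime):
    """Large-t estimate t^2 / (4 t') quoted in the text (not an exact value)."""
    return t * t / (4 * t_prime)


def relative_saving(t, t_prime):
    """Fractional drop in the estimate when t' is increased by one."""
    before = asymptotic_estimate(t, t_prime)
    after = asymptotic_estimate(t, t_prime + 1)
    return (before - after) / before


if __name__ == "__main__":
    t = 20
    for t_prime in (1, 2, 5, 9):
        estimate = asymptotic_estimate(t, t_prime)
        saving = relative_saving(t, t_prime)
        print(f"t'={t_prime}: estimate={estimate:.1f}, saving from t'+1={saving:.3f}")
```

Under this approximation the relative saving from one additional test per node is 1/(t' + 1), independent of t, so it falls from one half at t' = 1 to one tenth at t' = 9. This is consistent with the diminishing returns noted in the discussion of Fig. 3.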

It is interesting that as few as t' + 1 definitely good nodes can
exist in a t/s-diagnosable system (the minimum number of nodes in
R, as shown in Theorem 1) and that as few as t' definitely bad nodes
can exist. So, while the number of suspects grows quadratically, the
minimum number of definitely good and definitely bad nodes remains
constant. Thus, as t increases, the fraction of the total number of
nodes that are suspect can approach 100%.

Appendix 

Example  Illustrating  the  Proof  of  Lemma  1 

Lemma 1: Let FR be a maximal replacement set in a D_{1,t'}(n)
design corresponding to a set of test outcomes produced by a fault set
of size t or smaller, where t > t'. Then, FR is a set of consecutive
nodes.

Proof: Proceeds by contradiction. That is, we assume there exists
a maximal replacement set FR which does not consist of consecutive
nodes and show that this is impossible. Specifically, we show
that we can rearrange certain nodes (without changing their
fault-free/faulty status) to produce a replacement set that is larger than
FR.

As an example of the proof, consider the D_{1,2}(19) design shown
in Fig. 5. The syndrome shown consists of six fail test outcomes
(indicated by 1). If t = 8, there can be at most eight faulty nodes.
With t = 8, there are four definitely bad nodes divided into two
subsets {u4, u5} and {u13, u14} (indicated by shading consisting of
vertical lines). These are definitely bad because, if any one is fault-free,
there are more than eight faulty nodes. For example, if u13 is
fault-free, u12 is faulty, having failed u13. Similarly, u11 is faulty,
having passed u12, etc. Indeed, if u13 is fault-free, there are at least
seven other nodes that are faulty. There are seven definitely good
units divided into two subsets {u0, u1, u2, u3} and {u10, u11, u12}
(indicated by the absence of shading). These are definitely good,
since if any are faulty, we can identify more than eight faulty nodes.
The remaining nodes are suspect (indicated by shading consisting of
minuscule dots). These are divided into two subsets {u6, u7, u8, u9}
and {u15, u16, u17, u18}. The definitely bad and suspect nodes comprise
the replacement set FR. Following the proof, let

    B_i     = {u4, u5, u6, u7, u8, u9}          *B_i     = {u4, u5}

    B_{i+1} = {u13, u14, u15, u16, u17, u18}    *B_{i+1} = {u13, u14}

    G_i     = {u10, u11, u12}                   G**      = {u11, u12}.

Thus, FR = B_i ∪ B_{i+1}. *B_i and *B_{i+1} are the first t' = 2 nodes
in B_i and B_{i+1}, respectively. The proof of Lemma 1 shows that
|G_i| > |*B_{i+1}|. This is indeed true here.

Following the proof, we have G_i − G** = {u10}, which is removed
and inserted immediately in front of *B_i, that is, between u3 and
u4. This yields the system of Fig. 6. The test results affected by
the transplant of G_i − G** = {u10} are outlined in Fig. 6. The
numbers associated with arrows indicate the condition in the proof
that specifies the test value. Considering the resulting syndrome, we
find that the definitely bad, suspect, and definitely good nodes are
as shown in Fig. 6. Specifically, u4 and u5 are definitely bad, as
before. u13 and u14, which were definitely bad, are now suspect.
All suspect nodes before the change are still suspect. However, u11
and u12, which were definitely good, are now suspect (for example,
{u4, u5, u6, u7, u8, u9, u11, u12} can be a set of faulty nodes which
produces the syndrome shown). Thus, the total number of definitely
bad and suspect nodes is larger by 2. This results in a contradiction;
the claim that the original replacement set is maximal is wrong.

Fig. 5. Example of a D_{1,2}(19) design with a replacement set that does not
have consecutive nodes.

Fig. 6. The system of Fig. 5 rearranged to produce a larger replacement
set.


Interestingly,  the  resulting  replacement  set  consists  of  consecutive 
nodes.  Indeed,  it  is  a  maximal  replacement  set. 
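
The classification of nodes in the Fig. 5 example can be checked by brute force. The following Python sketch is an illustration only and is not part of the paper. It assumes that node u_i tests u_{i+1} and u_{i+2} (indices modulo 19), and that the six fail outcomes are those of u2 on u4, u3 on u4 and u5, u11 on u13, and u12 on u13 and u14. That syndrome is inferred from the text, since the figure is not reproduced here. The names FAILS, consistent, and classify are hypothetical. The sketch enumerates every fault set of at most t = 8 nodes that is consistent with the syndrome and reports which nodes are faulty in all of them (definitely bad), in none of them (definitely good), or in only some of them (suspect).

```python
from itertools import combinations

N = 19        # nodes u0..u18
T_PRIME = 2   # assumed: u_i tests u_{i+1} and u_{i+2} (mod N)
T = 8         # bound on the number of faulty nodes

# Assumed syndrome, inferred from the text: the six fail outcomes as
# (tester, tested) pairs; every other test outcome is a pass.
FAILS = {(2, 4), (3, 4), (3, 5), (11, 13), (12, 13), (12, 14)}

TESTS = [(i, (i + d) % N) for i in range(N) for d in range(1, T_PRIME + 1)]


def consistent(faulty):
    """Return True if the fault set `faulty` could produce the syndrome."""
    for tester, tested in TESTS:
        if tester in faulty:
            continue  # outcomes reported by faulty testers are unreliable
        # A fault-free tester fails exactly the faulty nodes it tests.
        if ((tester, tested) in FAILS) != (tested in faulty):
            return False
    return True


def classify():
    """Split the nodes into definitely bad, definitely good, and suspect."""
    candidates = []
    for size in range(T + 1):
        for combo in combinations(range(N), size):
            fault_set = set(combo)
            if consistent(fault_set):
                candidates.append(fault_set)
    definitely_bad = set.intersection(*candidates)   # faulty in every candidate
    possibly_bad = set.union(*candidates)            # faulty in some candidate
    definitely_good = set(range(N)) - possibly_bad   # faulty in no candidate
    suspect = possibly_bad - definitely_bad
    return definitely_bad, definitely_good, suspect


if __name__ == "__main__":
    bad, good, suspect = classify()
    print("definitely bad: ", sorted(bad))
    print("definitely good:", sorted(good))
    print("suspect:        ", sorted(suspect))
```

Under these assumptions the sketch reports u4, u5, u13, and u14 as definitely bad, u0 to u3 and u10 to u12 as definitely good, and u6 to u9 and u15 to u18 as suspect, matching the classification described for Fig. 5. The definitely bad and suspect nodes together form a replacement set of 12 nodes that is not consecutive.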

Algorithm-Based Fault Detection for Signal Processing
Applications

Abstract— The increasing demands for high-performance signal processing
along with the availability of inexpensive high-performance processors
have resulted in numerous proposals for special-purpose array
processors for signal processing applications. This correspondence
presents a functional-level concurrent error-detection scheme for such
VLSI signal processing architectures proposed for the FFT and QR
factorization. Some basic properties involved in such computations are
used to check the correctness of the computed output values. This
fault detection scheme is shown to be applicable to a class of problems
rather than a particular problem, unlike the earlier algorithm-based
error-detection techniques. The effects of roundoff/truncation errors due
to finite-precision arithmetic are evaluated. It is shown that the error
coverage is high with large word sizes.

Index Terms— Algorithm-based fault detection, FFT, finite-precision
errors, QR factorization, signal processing applications.